knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
library(terra)
# library(fasterize)
library(sf)
library(tidyverse)
library(here)
Here we will use the data provided by Frew in HW3.gdb
and a separately-exported raster Wind2002.tif (since
rasters in geodatabases are not readily accessible except in ArcMap) to
recreate the same basic analysis as we built in Model Builder for
assignment 3. The main steps:
As in other assignments, there are a few ways to do some of these steps, so I’ll show a few options where I can. First, let’s set up a couple of things, analogous to our environment settings.
### Set up a base raster with the CRS, extent, and resolution for our
### analysis. This is based on the wind raster.
base_rast <- terra::rast(here('HW3', 'data/Wind2002.tif'))
### Inspect it:
base_rast
## class : SpatRaster
## dimensions : 411, 564, 1 (nrow, ncol, nlyr)
## resolution : 200, 200 (x, y)
## extent : -61331.61, 51468.39, -404661, -322461 (xmin, xmax, ymin, ymax)
## coord. ref. : NAD_1983_California_Teale_Albers (EPSG:3310)
## source : Wind2002.tif
## name : Wind2002
## min value : 1
## max value : 7
# st_layers(here('HW3', 'data/Basemap.gdb')
county_sf <- read_sf(dsn = here('HW3', 'data/Basemap.gdb'),
layer = 'County')
rivers_sf <- read_sf(dsn = here('HW3', 'data/Basemap.gdb'),
layer = 'MajorRivers')
cities_sf <- read_sf(dsn = here('HW3', 'data/Basemap.gdb'),
layer = 'Cities')
roads_sf <- read_sf(dsn = here('HW3', 'data/HW3.gdb'),
layer = 'CenterLines2008')
### also set the preferred CRS, CA Teale Albers NAD 1987 in meters.
### We can get this from either the base rast or the county sf objects.
ca_alb <- st_crs(county_sf)
Rasters are really pretty nice to work with in R, especially if you
know a few things about how R thinks about them. As
spatRaster objects, they contain a matrix of cell values
and the spatial information (CRS, cell resolution, extent, etc) to map
those cells.
But we can access the cell values directly and manipulate them a bit if we realize that the values can also be considered just a really big vector, reading cell values from the top left, across in a row, then the next row down, and so on until the last value is the bottom right.
So while terra::classify() does the trick for complex
reclassification jobs (we need a matrix to provide the rules for
reclassifying), we can also simply reach in and replace values directly
using indexing, e.g to turn low values of the base raster into NAs:
x <- values(base_rast) ### a one-d array
base_copy <- base_rast
plot(base_copy, col = hcl.colors(9), main = 'before clip')
values(base_copy)[x < 3] <- NA
plot(base_copy, col = hcl.colors(9), main = 'after clip')
We can also use basic arithmetic to calculate things with rasters, as
if they’re just vectors, while keeping in mind that NAs
will poison the calculation (and we can use that to our advantage). Note
the NAs from base_copy are missing when we add it back to
base_rast:
base_math <- base_rast + 2*base_copy
### add twice the base_copy cell values to the base_rast cell values
plot(base_math, col = heat.colors(9))
Click on the tabs to see each model in turn.
Here we can simply use the terra::classify() function
just as we did in ArcMap model builder (with Reclassify). We’ll need to
set up a matrix of original values and the new values. There’s also
another way to do the same thing, shown after
terra::classify().
wind_raw_rast <- rast(here('HW3', 'data/Wind2002.tif'))
plot(wind_raw_rast, main = 'Raw wind raster', axes = FALSE, legend = FALSE)
### reclassification can be "is-becomes" to reassign specific values:
reclass_mtx <- matrix(data = c(c( 1, 2, 3, 4, 5, 6, 7), ### is
c(NA, NA, 1, 1, 1, 1, 1)), ### becomes
ncol = 2, byrow = FALSE)
reclass_mtx
## [,1] [,2]
## [1,] 1 NA
## [2,] 2 NA
## [3,] 3 1
## [4,] 4 1
## [5,] 5 1
## [6,] 6 1
## [7,] 7 1
wind_mask_rast <- terra::classify(wind_raw_rast, rcl = reclass_mtx)
plot(wind_mask_rast, main = 'Reclassified wind raster',
col = 'red', axes = FALSE, legend = FALSE)
But we can also use vector indexing to directly replace the values too!
### first make a copy of the wind_raw_rast so we can change it
wind_mask2 <- wind_raw_rast
### turn low wind values to NA:
values(wind_mask2)[values(wind_mask2) < 3] <- NA
### note, we haven't changed the other values to 1 yet; technically
### that's fine - a mask really only needs to have data cells and
### NA cells. But let's change them anyway - here's a trick I like to
### use to quickly set all non-NAs (and non-zeros) to 1: simply
### divide the raster by itself!
wind_mask2 <- wind_mask2 / wind_mask2
plot(wind_mask2, main = 'Wind mask alt method',
col = 'red', axes = FALSE, legend = FALSE)
### use all.equal to compare aspects of the rasters for equality
all.equal(wind_mask_rast, wind_mask2)
## [1] TRUE
And write one to output:
writeRaster(wind_mask_rast, 'output/wind_mask.tif', overwrite = TRUE)
Let’s read in the roads layer, and create a mask from that
(remembering we need to project it to a new CRS, using
sf::st_transform(), and using our ca_alb CRS
we borrowed from the County layer). The raster package
doesn’t have a nice Euclidean Distance function, and the
distance and distanceToPoints functions like
an old-school spatial object (sp package), rather than an
sf object. So here we’ll use a different method:
st_transform() to project the roads into CA Teale
Albers.st_buffer and set a distance of 7,500
m.(We could do this same method in ArcMap too, using different tools instead of Euclidean Distance and Reclassify).
road_mask_file <- here('HW3', 'output/road_mask.gpkg')
if(!file.exists(road_mask_file)) {
# st_layers('data/HW3.gdb')
roads_raw_sf <- read_sf(dsn = here('HW3', 'data/HW3.gdb'),
layer = 'CenterLines2008')
# st_crs(roads_raw_sf) ### note +units=us-ft instead of meters
roads_project_sf <- roads_raw_sf %>%
st_transform(ca_alb)
### Now, buffer it, union the results (flatten into a single
### feature), then intersect it with the County (not necessary,
### but why not, trims off the buffer that extends beyond the county line)
roads_mask_sf <- roads_project_sf %>%
st_buffer(dist = 7500) %>%
st_union(by_feature = FALSE) %>%
st_intersection(county_sf)
write_sf(roads_mask_sf, road_mask_file, delete_layer = TRUE)
} else {
roads_mask_sf <- read_sf(road_mask_file)
}
plot(roads_mask_sf, main = 'Roads mask', legend = FALSE)
But we can also rasterize this. When putting all the masks together, having them all as rasters can give us some nice advantages. So let’s just do that here while we’re at it.
roads_mask_rast <- terra::rasterize(roads_mask_sf, base_rast)
plot(roads_mask_rast, main = 'Roads mask as raster',
col = 'blue', axes = FALSE, legend = FALSE)
### The vector roads mask has different extensions, so we can save the raster
### as a tif without risk of overwriting the vector files.
writeRaster(roads_mask_rast, here('HW3', 'output/roads_mask.tif'),
overwrite = TRUE)
We can do this process in a similar way we did the roads, using
st_buffer to create the exclusion area (rather
than the inclusion area near roads), and then subtracting that
from the County polygon.
st_buffer and set a
distance of 1609 m (~1 mile)st_difference().Note that I encountered an error:
Error in CPL_geos_op2(op, st_geometry(x), st_geometry(y)) :
Evaluation error: TopologyException: Input geom 1 is invalid:
Ring Self-intersection at or near point ...
These are caused by tiny mistakes in the shapefile that are pretty
much impossible to spot visually, where two polygon vertices land right
on top of each other. There are some tests built into sf:
st_is_valid(), st_is_simple() that will return
TRUE or FALSE for each feature in the polygon
set. We can use these to find the errors… but sometimes our
manipulations (st_union(), st_intersect() etc)
cause new glitches.
An easy (but not intuitive) trick can help fix invalid geometries:
adding a zero-distance buffer (st_buffer(x, dist = 0)).
urban_mask_file <- 'output/urban_mask.gpkg'
if(!file.exists(urban_mask_file)) {
# st_layers(here('HW3', 'data/HW3.gdb'))
urban_2001_sf <- read_sf(dsn = here('HW3', 'data/HW3.gdb'),
layer = 'Urban2001')
urban_2020_sf <- read_sf(dsn = here('HW3', 'data/HW3.gdb'),
layer = 'Urban2020')
# st_is_valid(urban_2001_sf) ### these all look good
# st_is_valid(urban_2020_sf) ### looks like some invalid geometries in there
### can we get around the error by putting a zero buffer on the urban_2020 layer?
urban_2020_fix <- urban_2020_sf %>% st_buffer(dist = 0)
# st_is_valid(urban_2020_fix) ### all are valid now!
### union the two layers, then make a buffer, then flatten that
### using union with only one argument
urban_buffer_sf <- st_union(urban_2001_sf, urban_2020_fix) %>%
st_buffer(dist = 1609) %>% ### 1 mile = 1609 meters
st_union()
### now subtract that from the County layer.
urban_mask_sf <- st_difference(county_sf, urban_buffer_sf)
write_sf(urban_mask_sf, urban_mask_file, delete_layer = TRUE)
} else {
urban_mask_sf <- read_sf(urban_mask_file)
}
plot(urban_mask_sf %>% select('NAME'),
main = 'Urban mask', legend = FALSE)
Again, let’s also make a raster version for later.
urban_mask_rast <- terra::rasterize(urban_mask_sf, base_rast)
plot(urban_mask_rast, main = 'Urban mask as raster',
col = 'brown', axes = FALSE, legend = FALSE)
writeRaster(urban_mask_rast, here('HW3', 'output/urban_mask.tif'),
overwrite = TRUE)
We can do something like the Euclidean Distance tool here. The
terra::distance() function takes a raster input (not a line
or polygon feature) and replaces empty spaces (NA values)
with the distance to the nearest non-NA cell. Since the urban zones are
polygons, we can rasterize those, then use the distance()
function. It’s a little clunky but I wanted to show that it can be
done.
As noted before, we’ll need to fix the invalid geometry in the urban 2020 polygons.
# st_layers(dsn = here('HW3', 'data/HW3.gdb'))
urban_2001_sf <- read_sf(dsn = here('HW3', 'data/HW3.gdb'),
layer = 'Urban2001')
urban_2020_sf <- read_sf(dsn = here('HW3', 'data/HW3.gdb'),
layer = 'Urban2020')
urban_2020_fix <- urban_2020_sf %>% st_buffer(dist = 0)
### union the two layers to a single feature then rasterize
urban_sf <- st_union(urban_2001_sf, urban_2020_fix)
urban_rast <- terra::rasterize(urban_sf, base_rast)
### Now we can use the terra::distance() function to find the
### distance away from the urban areas
urban_dist_rast <- terra::distance(urban_rast)
plot(urban_dist_rast, main = 'Distance from urban areas',
axes = FALSE, legend = TRUE)
### Now, we can reclassify it (though I'll use the indexing version
### because it just seems cleaner to me). First create a copy
urban_mask2 <- urban_dist_rast
urban_mask2[urban_mask2 <= 1609] <- NA
### Finally, let's mask it with the county boundary
urban_mask2 <- urban_mask2 %>%
mask(county_sf)
writeRaster(urban_mask2, here('HW3', 'output/urban_mask2.tif'),
overwrite = TRUE)
plot(urban_mask2, main = 'Urban mask Euc Dist method',
col = 'brown', axes = FALSE, legend = FALSE)
How do the two raster versions compare? The all.equal()
function tells us they’re not identical. Instead, let’s compare the
values by plotting one on top of the other.
all.equal(urban_mask2, urban_mask_rast) ### FALSE! not equal.
## [1] "Attributes: < Component \"ptr\": Component \"range_max\": Mean relative difference: 0.9999698 >"
## [2] "Attributes: < Component \"ptr\": Component \"range_min\": Mean relative difference: 0.9993798 >"
# compareRaster(urban_mask2, urban_mask_rast, values = TRUE) ### error: not all values equal
plot(urban_mask2, col = 'red', axes = FALSE, legend = FALSE)
plot(urban_mask_rast, col = 'blue', axes = FALSE, legend = FALSE, add = TRUE)
You can see tiny rings of red, showing that the
distance() version has a slight difference from the
st_buffer() version. How important is this difference to
our analysis? A one-mile buffer seems sort of arbitrary, so a little
difference one way or the other is probably not a deal-killer.
When working with spatial data, you often have the option to choose between raster formats (and resolution, etc) and vector formats. A good rule of thumb is:
Raster is faster, but vector is correcter.
Again, this builds on what we’ve done with previous layers. We can filter the parcels to only include the public parcels, then subtract those from the county. As with the roads, we’ll need to reproject the layer (at some point) to the CRS of California Teale Albers.
public_mask_file <- here('HW3', 'output/public_mask.gpkg')
if(!file.exists(public_mask_file)) {
# st_layers(here('HW3', 'data/HW3.gdb'))
parcels_sf <- read_sf(dsn = here('HW3', 'data/HW3.gdb'),
layer = 'Parcels')
### transform the CRS, then filter for the public use codes.
### Let's convert the text use codes to numeric, then select
### codes between 6000 and (less than) 9000.
### Let's also flatten it with st_union into a single polygon
### so it is easier for the st_difference() function.
parcels_public <- parcels_sf %>%
st_transform(ca_alb) %>%
mutate(USECODE = as.integer(USECODE)) %>%
filter(USECODE < 9000 & USECODE >= 6000) %>%
st_union(by_feature = FALSE)
### Subtract that from the County layer.
public_mask_sf <- st_difference(county_sf, parcels_public)
write_sf(public_mask_sf, public_mask_file, delete_layer = TRUE)
} else {
public_mask_sf <- read_sf(public_mask_file)
}
plot(public_mask_sf %>% select('NAME'),
main = 'Public mask', legend = FALSE)
And yet again, let’s also make a raster version for later.
public_mask_rast <- terra::rasterize(public_mask_sf, base_rast)
plot(public_mask_rast, main = 'Public mask as raster',
col = 'darkgreen', axes = FALSE, legend = FALSE)
writeRaster(public_mask_rast, here('HW3', 'output/public_mask.tif'),
overwrite = TRUE)
This is nearly identical to the Public mask in terms of methodology. On my first attempt, I noticed an error that causes things to choke:
Evaluation error: TopologyException: Input geom 0 is invalid: Ring Self-intersection at or near point 50040.80700000003 -400081.27190000005 at 50040.80700000003 -400081.27190000005.
To get around this, I dissolve the interior boundaries (using
st_union as a “dissolve” tool) of the two fire layers
before joining them. Sometimes complicated polygon features have
conflicts when you join them or subtract them; ArcMap is good at working
around these conflicts, but R is not so good. But by dissolving a bunch
of polygons into one, we can sometimes avoid creating those conflicts.
If you need to keep track of the attributes, this workaround will not
work for you.
fire_mask_file <- here('HW3', 'output/fire_mask.gpkg')
if(!file.exists(fire_mask_file)) {
# st_layers('data/HW3.gdb')
fire_lra_sf <- read_sf(dsn = here('HW3/data/HW3.gdb'),
layer = 'FireLRA')
fire_sra_sf <- read_sf(dsn = here('HW3/data/HW3.gdb'),
layer = 'FireSRA')
### Filter the SRA to the proper hazard codes, and then union
### the two layers. Again, let's also flatten it with st_union
### so it is easier for the st_difference() function.
fire_sra_filtered_sf <- fire_sra_sf %>%
filter(HAZ_CODE == 3) %>%
st_union(by_feature = FALSE) ### flatten it
fire_joined_sf <- fire_lra_sf %>%
st_union(by_feature = FALSE) %>% ### flatten it
st_union(fire_sra_filtered_sf) ###
### Subtract that from the County layer.
fire_mask_sf <- st_difference(county_sf, fire_joined_sf)
write_sf(fire_mask_sf, fire_mask_file, delete_layer = TRUE)
} else {
fire_mask_sf <- read_sf(fire_mask_file)
}
plot(fire_mask_sf %>% select('NAME'),
main = 'Fire mask', legend = FALSE)
And the raster version.
fire_mask_rast <- terra::rasterize(fire_mask_sf, base_rast)
plot(fire_mask_rast, main = 'Fire mask as raster',
col = 'darkcyan', axes = FALSE, legend = FALSE)
writeRaster(fire_mask_rast, here('HW3', 'output/fire_mask.tif'),
overwrite = TRUE)
This is nearly identical to the Urban mask - I’ll do this with a buffer again.
airport_mask_file <- here('HW3', 'output/airport_mask.gpkg')
if(!file.exists(airport_mask_file)) {
# st_layers(here('HW3', 'data/HW3.gdb'))
airport_sf <- read_sf(dsn = here('HW3/data/HW3.gdb'),
layer = 'Airports')
airport_buffer_sf <- airport_sf %>%
st_buffer(dist = 7500) %>%
st_union(by_feature = FALSE)
### Subtract that from the County layer.
airport_mask_sf <- st_difference(county_sf, airport_buffer_sf)
write_sf(airport_mask_sf, airport_mask_file, delete_layer = TRUE)
} else {
airport_mask_sf <- read_sf(airport_mask_file)
}
plot(airport_mask_sf %>% select('NAME'),
main = 'airport mask', legend = FALSE)
And the raster version.
airport_mask_rast <- rasterize(airport_mask_sf, base_rast)
plot(airport_mask_rast, main = 'Airport mask as raster',
col = 'grey30', axes = FALSE, legend = FALSE)
writeRaster(airport_mask_rast, here('HW3/output/airport_mask.tif'),
overwrite = TRUE)
We can do the final part of the analysis by using the
terra::mask() function (works on sf objects
too, not just rasters!). We should be able to do it in one big
tidyverse style chunk, piping one mask operation straight
into the next.
Note, the order of masks doesn’t matter, since we’re just looking for places where all masks are “on” - with one caveat: as in the Extract from Mask tool in ArcMap, the first argument needs to be a raster, but each link in the chain will spit out a raster that feeds into the next mask call.
all_mask_rast <- wind_mask_rast %>%
terra::mask(roads_mask_sf) %>%
mask(fire_mask_sf) %>%
mask(urban_mask_sf) %>%
mask(airport_mask_sf) %>%
mask(public_mask_sf)
writeRaster(all_mask_rast, here('HW3', 'output/suitable_mask.tif'),
overwrite = TRUE)
plot(all_mask_rast, col = 'green',
main = 'Suitable sites per R',
axes = FALSE, legend = FALSE)
Check it against ArcMap results (though these have IDs on them, so convert to values of 1 or NA):
suitable_key <- rast(here('HW3', 'hw3_arc_outputs/suitable.tif'))
values(suitable_key)[values(suitable_key) > 0] <- 1
plot(suitable_key, main = 'Suitable sites per ArcMap',
col = 'green',
axes = FALSE, legend = FALSE)
We can also use all those rasters we created, and using
c() turn them into a multilayer raster object, layering
them all together, allowing us to do quick calculations on the cell
values. Some of the things you can do with a multilayer object are using
sum() to sum up all the values for each cell, or
prod() to multiply them, or find a mean or standard
deviation. Here we’ll try two things:
na.rm = FALSE;
any cell with NA for any layer will get an NA,
while any cell with a 1 for all layers will get a 1 (product of a bunch
of 1s). This will duplicate the sequence of mask operations above.na.rm = TRUE. This
will not mask per se, but will instead tell us how many of the criteria
are suitable for each cell (including areas where zero criteria
match).mask_stack <- c(wind_mask_rast,
roads_mask_rast,
urban_mask_rast,
public_mask_rast,
airport_mask_rast,
fire_mask_rast)
mask_product <- prod(mask_stack, na.rm = FALSE)
plot(mask_product, main = 'Mask product', col = 'darkblue',
axes = FALSE, legend = FALSE)
writeRaster(mask_product, here('HW3', 'output/suitable_stack.tif'),
overwrite = TRUE)
mask_sum <- sum(mask_stack, na.rm = TRUE)
plot(mask_sum, main = 'Mask sum', col = hcl.colors(6),
axes = FALSE, legend = TRUE)
Check it against ArcMap results. With both the ArcMap version and the R version set so all suitable values are 1, we can subtract one from the other (with na.rm = TRUE); any different values (suitable in one but not the other) will either by +1 or -1 (identical values will cancel to 0).
suitable_key <- rast(here('HW3', 'hw3_arc_outputs/suitable.tif'))
values(suitable_key)[values(suitable_key) > 0] <- 1
diff_rast <- sum(suitable_key, -mask_product)
### sum the values of this difference raster; using absolute
### value, any different cells should result in sum > 0
sum(abs(values(diff_rast)), na.rm = TRUE)
## [1] 0
Now that we’ve found the suitable cells, we can use
terra::patches() to do the same basic thing as the Region
Groups tool in ArcMap, to identify connected raster cells and give them
distinct IDs. Let’s work with the mask_product results.
results_patch <- mask_product %>%
terra::patches(directions = 8) ### 8 = "queen's case", 4 = "rook's case"
plot(results_patch)
To identify the top ten sites by connected area, let’s use some
tidyverse magic. Remembering that a raster can be
considered as just a long vector of cell values, we can create a
dataframe with the raster cell values, then aggregate and count ’em up
to rank by size.
top_ten_df <- as.data.frame(results_patch, xy = TRUE) %>%
### names of columns are "x", "y", and "patches" (because the
### name of results_patch data is called "patches")
rename(patch_id = patches) %>%
filter(!is.na(patch_id)) %>%
group_by(patch_id) %>%
summarize(cells = n(),
hectares = cells * 4,
### also calculate centroid:
mean_x = mean(x),
mean_y = mean(y)) %>%
arrange(desc(cells)) %>%
mutate(rank = 1:n()) %>%
filter(rank <= 10)
Then we can substitute the rank values into the
results_patch raster, using the terra::subst()
function to substitute raster values with new values from the
top_ten_df data frame (kind of like
reclassify()). Then use ggplot() to make a
nicer map than the base plot() function.
top10_rast <- terra::subst(results_patch,
from = top_ten_df$patch_id,
to = top_ten_df$rank,
others = NA) %>%
setNames('rank')
### overwrite layer name from "patches" to "rank"
### ggplot can't work directly on rasters, it needs data.frames. Let's
### also make the rank as a factor so it's categorical.
top10_pts <- as.data.frame(top10_rast,
xy = TRUE) %>%
mutate(rank = factor(rank))
ggplot() +
geom_sf(data = county_sf, fill = 'grey30', color = 'yellow') +
geom_raster(data = top10_pts, aes(x = x, y = y, fill = rank)) +
scale_fill_viridis_d() +
theme_minimal() +
theme(axis.text = element_blank(),
axis.title = element_blank())
We can also turn the ranked results into polygons, giving us more
control over display in ggplot.
raster::rasterToPolygons() turns it into an older spatial
format, so we can coerce it into sf for easier use using
the st_as_sf() function.
top10_poly <- terra::as.polygons(top10_rast) %>%
### convert Terra SpatVector output to simple features:
st_as_sf() %>%
### mutate rank to be categorical, not continuous
mutate(rank = factor(rank))
final_map <- ggplot() +
### Add in county, major roads, cities, rivers
geom_sf(data = county_sf, fill = 'grey30', color = 'black') +
geom_sf(data = roads_sf %>% filter(SPEED_LIM >= 45),
color = 'grey50', size = .1) +
geom_sf(data = cities_sf,
color = 'grey50', fill = 'grey40', size = .1) +
geom_sf(data = rivers_sf, color = 'slateblue', size = .1) +
### Add polygons of top 10 sites, colored by rank
geom_sf(data = top10_poly, aes(fill = rank),
show.legend = FALSE,
color = 'yellow', size = .1) +
### Add labels for cities
geom_sf_text(data = cities_sf, aes(label = CITY),
color = 'grey90', size = 2.5) +
### Add labels for wind sites based on rank
geom_sf_label(data = top10_poly, aes(label = rank),
color = 'white', fill = 'darkred',
size = 3.5) +
scale_fill_viridis_d() +
theme_void()
ggsave(here('HW3', 'img/final_map.png'),
width = 10, height = 6, dpi = 300)
knitr::include_graphics(here('HW3', 'img/final_map.png'))
You may hate the aesthetics I’ve chosen, so play around with them, yay!